Normalization and Matching in the DORO System
نویسندگان
چکیده
This paper is concerned with the use of linguistically motivated phrases as indexing terms in Information Retrieval applications. Apart from the conventional noun phrases, we propose to use verb phrases as index terms for text classification. Techniques for phrase matching through syntactic normalization and semantical matching are described. We discuss the realization of the syntactic normalization of phrases by transduction to frames. Semantical normalization is based on lexico-semantical relations, taking into account certain properties of the classification algorithms used. The ideas described here are being implemented in the Document Routing system DORO, in which statistical learning algorithms are applied to document profiles consisting of phrases. This paper describes the rationale behind work in progress, rather than presenting final results.
منابع مشابه
Evaluation of the Parameters Involved in the Iris Recognition System
Biometric recognition is an automatic identification method which is based on unique features or characteristics possessed by human beings and Iris recognition has proved itself as one of the most reliable biometric methods available owing to the accuracy provided by its unique epigenetic patterns. The main steps in any iris recognition system are image acquisition, iris segmentation, iris norm...
متن کاملRobust Iris Recognition in Unconstrained Environments
A biometric system provides automatic identification of an individual based on a unique feature or characteristic possessed by him/her. Iris recognition (IR) is known to be the most reliable and accurate biometric identification system. The iris recognition system (IRS) consists of an automatic segmentation mechanism which is based on the Hough transform (HT). This paper presents a robust IRS i...
متن کاملMatching Scores of System Relevance and User-Oriented Relevance in SID, ISC and Google Scholar
Background and Aim: The main aim of Information storage and retrieval systems is keeping and retrieving the related information means providing the related documents with users’ needs or requests. This study aimed to answer this question that how much are the system relevance and User- Oriented relevance are matched in SID, SCI and Google Scholar databases. Method: In this study 15 keywords of ...
متن کاملVehicle Logo Recognition Using Image Matching and Textural Features
In recent years, automatic recognition of vehicle logos has become one of the important issues in modern cities. This is due to the unlimited increase of cars and transportation systems that make it impossible to be fully managed and monitored by human. In this research, an automatic real-time logo recognition system for moving cars is introduced based on histogram manipulation. In the proposed...
متن کاملUsing rule-based natural language processing to improve disease normalization in biomedical text
BACKGROUND AND OBJECTIVE In order for computers to extract useful information from unstructured text, a concept normalization system is needed to link relevant concepts in a text to sources that contain further information about the concept. Popular concept normalization tools in the biomedical field are dictionary-based. In this study we investigate the usefulness of natural language processin...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1999